# Load all required libraries
library(knitr)
library(quantmod)
library(lubridate)
library(dplyr)
library(ggplot2)
library(readr)
library(moments)
library(tseries)
library(MSGARCH)
library(gridExtra)
library(zoo)
library(plotly)
library(tidyverse)
library(doParallel)
library(foreach)
library(kableExtra)
Regime-Switching Volatility Forecasting
Overview
To produce realistic volatility forecasts, I aim to combine two statistical concepts: Regime Detection (HMM) and Volatility Clustering (GARCH).
2. The “Engine”: GARCH
The Intuition: Volatility is “sticky.” If the market was wild yesterday, it is likely to be wild today. Standard statistical models miss this, but GARCH (Generalized Autoregressive Conditional Heteroskedasticity) is designed specifically to capture it.
The Mechanics: The GARCH(1,1) model forecasts volatility (\(h_t\)) based on its own history:
\[ h_t = \omega + \alpha \epsilon_{t-1}^2 + \beta h_{t-1} \]
- \(\omega\) (The Baseline): The long-term average volatility level.
- \(\alpha\) (The Reaction): How strongly the market reacts to yesterday’s shock (“Did breaking news hit the market?”).
- \(\beta\) (The Memory): How long that shock persists (“Is the panic fading away, or is it sticking around?”).
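To make the recursion concrete, here is a minimal base-R simulation of a GARCH(1,1) process. The parameter values are illustrative only, not estimates from this project.

```r
# Minimal GARCH(1,1) simulation: h_t = omega + alpha*eps_{t-1}^2 + beta*h_{t-1}
set.seed(42)
n     <- 1000
omega <- 0.05   # the baseline
alpha <- 0.10   # the reaction
beta  <- 0.85   # the memory

h   <- numeric(n)
eps <- numeric(n)
h[1]   <- omega / (1 - alpha - beta)  # start at the long-run variance (= 1 here)
eps[1] <- sqrt(h[1]) * rnorm(1)

for (t in 2:n) {
  h[t]   <- omega + alpha * eps[t - 1]^2 + beta * h[t - 1]
  eps[t] <- sqrt(h[t]) * rnorm(1)
}

# Volatility clustering shows up as positive autocorrelation in |eps|
cor(abs(eps[-1]), abs(eps[-n]))
```

With persistence \(\alpha + \beta = 0.95\), quiet and wild stretches alternate in long blocks rather than mixing randomly.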
3. The Synthesis: MS-GARCH

Standard GARCH models have a major flaw: they assume the market’s behavior (the \(\alpha\) and \(\beta\) parameters) never changes. This is unrealistic—a market crash behaves fundamentally differently than a slow summer rally.
The Solution: Markov-Switching GARCH
By combining HMM and GARCH, we create a model that adapts to the environment:
- Multiple Engines: We fit two separate GARCH models. One captures the dynamics of a “Calm” market, and the other captures a “Turbulent” market.
- Dynamic Switching: The model uses the HMM probabilities to switch between these two engines in real time, weighting each engine by how likely its regime currently is.
Result: A forecast that not only predicts volatility but also adapts its rules when the market shifts gears.
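As a toy illustration of that weighted switch (all numbers here are made up for illustration):

```r
# Regime-probability-weighted one-step variance forecast (illustrative values)
h_calm      <- 0.5   # variance forecast from the "Calm" GARCH engine
h_turbulent <- 4.0   # variance forecast from the "Turbulent" GARCH engine
p_turbulent <- 0.30  # filtered probability of currently being turbulent

h_combined <- (1 - p_turbulent) * h_calm + p_turbulent * h_turbulent
h_combined  # 1.55: the forecast leans toward whichever regime is more likely
```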
Libraries needed
To run this project successfully, we use the libraries loaded in the setup chunk above.
Data Preprocessing
Fetching Index Prices
First, we are going to fetch index data for the ASX 200 (^AXJO) from 2000 to the present.
# 1. Configuration
start_date <- "2000-01-01"
# ASX200 (^AXJO)
ticker_map <- list(
  "^AXJO" = "ASX200"
)
# 2. Loop, Fetch, and Save
for (symbol in names(ticker_map)) {
  data <- getSymbols(symbol, src = "yahoo", from = start_date, auto.assign = FALSE)
  adj_data <- Ad(data)
  names(adj_data) <- paste0(ticker_map[[symbol]], ".Adjusted")
  write.zoo(adj_data, file = paste0(ticker_map[[symbol]], ".csv"), sep = ",", col.names = TRUE, index.name = "Date")
}
Above, you can see how I use the quantmod library to fetch the index data. Notice that I keep only the adjusted closing price, via Ad(data).
Let’s see what the data we have fetched looks like.
ASX200
Create nice HTML for ASX200
# Read the ASX200 data
asx_data <- read.csv("ASX200.csv")
DT::datatable(
asx_data,
options = list(
pageLength = nrow(asx_data),
scrollY = 200,
paging = FALSE,
searching = FALSE,
info = FALSE,
dom = 't',
columnDefs = list(
list(className = 'dt-center', targets = "_all")
)
),
rownames = FALSE,
caption = htmltools::tags$caption(
style = 'caption-side: bottom; text-align: center;',
'Scrollable: All ASX200 Rows'
)
)
Dealing with NAs
Percentage NAs in ASX200 dataset
# Calculate % NAs in .Adjusted column
asx_na <- round(sum(is.na(asx_data$ASX200.Adjusted)) / length(asx_data$ASX200.Adjusted) * 100, 2)
# Create table for display
all_na_pct <- data.frame(
Dataset = "ASX200",
NA_Percentage = asx_na
)
DT::datatable(
all_na_pct,
options = list(
dom = 't',
paging = FALSE,
columnDefs = list(list(className = 'dt-center', targets = "_all"))
),
rownames = FALSE,
caption = htmltools::tags$caption(
style = 'caption-side: bottom; text-align: center;',
'% NAs in the ASX200 Dataset'
)
)
We have identified that a small portion of the data is missing. After some careful investigation, I identified the reason behind this. Let's take a look at this example:
Investigating the misaligned dates
# Convert Date column to a proper Date object
asx_data$Date <- as.Date(asx_data$Date)
# Add a 'Day_of_Week' column
asx_data$Day_of_Week <- weekdays(asx_data$Date)
# We look at the first 6 rows, where the shift is most obvious
subset_view <- asx_data[1:6, c("Date", "ASX200.Adjusted", "Day_of_Week")]
# Display the subset_view as an HTML table
DT::datatable(
subset_view,
options = list(
dom = 't',
paging = FALSE,
columnDefs = list(list(className = 'dt-center', targets = "_all"))
),
rownames = FALSE,
caption = htmltools::tags$caption(
style = 'caption-side: bottom; text-align: center;',
'First Few Rows of ASX200 Dataset (with Day of Week)'
)
)
The “Sunday” Error: We see a price for Sunday, Jan 9. The ASX is closed on Sundays. This is actually Monday’s price, shifted back by one day.
The “Missing Friday”: We see Thursday (Jan 6) followed immediately by Sunday (Jan 9). Friday Jan 7 is missing. The price 3023.40 listed on Thursday is actually the price for that missing Friday.
The solution: shift the ASX dates by +1 day, then check whether any Saturdays or Sundays remain in the dataset. However, there is more to it than that; I will explain each step clearly after this code chunk.
Fixes misaligned ASX dates by shifting ‘bad’ weeks and removing weekends
# The Fix
asx_fixed <- asx_data %>%
mutate(
Week_ID = paste(isoyear(Date), isoweek(Date), sep = "-"),
Day_Num = wday(Date, week_start = 1)
) %>%
group_by(Week_ID) %>%
mutate(
Needs_Shift = any(Day_Num == 7)
) %>%
ungroup() %>%
mutate(
Date_Corrected = if_else(Needs_Shift, Date + 1, Date)
) %>%
filter(wday(Date_Corrected, week_start = 1) <= 5) %>%
select(Date = Date_Corrected, ASX200.Adjusted)
write.csv(asx_fixed, "ASX200_Fixed.csv", row.names = FALSE)
Now let me explain what the code does:
1. Grouping by Week (Week_ID)
mutate(
Week_ID = paste(isoyear(Date), isoweek(Date), sep = "-"),
Day_Num = wday(Date, week_start = 1)
) %>%
group_by(Week_ID)
- Logic: The timezone error typically affects contiguous blocks of data. Instead of analyzing day-by-day, I group the data into ISO weeks (Monday to Sunday). This allows the algorithm to assess the integrity of an entire trading week at once.
2. Error Detection (Needs_Shift)
mutate(
Needs_Shift = any(Day_Num == 7)
)
The script scans each Week_ID for the presence of a Sunday (Day 7).
The ASX does not trade on Sundays. Therefore, if a week contains a “Sunday” timestamp, it confirms that the entire week’s data has been incorrectly shifted backward (e.g., Monday’s data is labeled as Sunday, Friday’s data is labeled as Thursday).
3. Conditional Repair (if_else)
mutate(
Date_Corrected = if_else(Needs_Shift, Date + 1, Date)
- If Needs_Shift is TRUE: The code adds +1 day to every date in that week. This restores Sunday \(\rightarrow\) Monday and Thursday \(\rightarrow\) Friday.
- If Needs_Shift is FALSE: The algorithm preserves the original dates, preventing the corruption of already-correct data (e.g., Good Friday holidays).
4. Artifact Removal (filter)
filter(wday(Date_Corrected, week_start = 1) <= 5)
- In the misaligned weeks, the row originally labeled “Friday” (which was actually garbage or duplicate data) shifts to “Saturday” after the correction. This step strictly filters the final dataset to keep only Monday through Friday, automatically removing these artifacts.
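Once the fix is written out, it is worth verifying that the repair really removed every weekend row. Here is a minimal base-R sketch of that kind of check (check_weekdays_only is a hypothetical helper name, and the dates are a toy example):

```r
# Sanity check: a corrected trading-day series should contain weekdays only.
# format(date, "%u") returns the ISO day number as text (Monday = 1, Sunday = 7).
check_weekdays_only <- function(dates) {
  all(as.integer(format(dates, "%u")) <= 5)
}

# Toy example: a Monday, a Tuesday, and a Friday pass; a Sunday fails.
good_dates <- as.Date(c("2000-01-10", "2000-01-11", "2000-01-14"))
check_weekdays_only(good_dates)             # TRUE
check_weekdays_only(as.Date("2000-01-09"))  # FALSE (a Sunday)
```

Running the same check on the Date column of the fixed CSV should return TRUE.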
Exploratory Data Analysis
Time Series Chart
There is absolutely nothing more satisfying than being able to create a time-series chart after a long stretch of wrangling with messy data. Here I present the time series for the ASX 200.
Time series chart for ASX200
# Load necessary libraries
# Read data
asx_df <- read_csv("ASX200_Fixed.csv")
# Ensure Date is Date type
asx_df$Date <- as.Date(asx_df$Date)
# Plot time series
ggplot(asx_df, aes(x = Date, y = ASX200.Adjusted)) +
geom_line(color = "red", linewidth = 0.8) +
scale_x_date(date_labels = "%Y", date_breaks = "2 years") +
labs(
title = "Time Series of ASX 200",
x = "Year",
y = "Adjusted Closing Price"
) +
theme_classic() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1)
)
Raw index points are hard to interpret on their own. To get a clearer picture of actual growth, I re-indexed the series to a starting value of 100, which makes it much easier to track cumulative performance over time.
Plot Re-indexed Growth (Base = 100)
asx_base <- asx_df$ASX200.Adjusted[1]
# Create the indexed price column
asx_df <- asx_df %>%
mutate(Indexed_Price = (ASX200.Adjusted / asx_base) * 100)
# Plot ASX 200 growth
ggplot(asx_df, aes(x = Date, y = Indexed_Price)) +
geom_line(color = "red", linewidth = 0.8) +
scale_x_date(date_labels = "%Y", date_breaks = "2 years") +
labs(
title = "Relative Growth: ASX 200",
subtitle = "Re-indexed to Base 100 (Jan 2000)",
x = "Year",
y = "Growth (Base = 100)"
) +
theme_classic() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1)
)
Log Returns
Attempting to model raw price levels is often futile because asset prices typically follow a random walk. However, volatility—the ‘risk’ of the market—is highly predictable because it clusters over time. To isolate this volatility signal and ensure our data is stationary (a requirement for GARCH), the first step is to transform our prices into log-returns.
The formula for log-returns is:
\[ r_t = 100 \times \left(\log P_t - \log P_{t-1}\right) \]
where \(P_t\) is the price at time \(t\).
Calculate Log Returns
asx_df <- asx_df %>%
mutate(Return = c(NA, diff(log(ASX200.Adjusted))) * 100) %>%
na.omit() # Remove the first NA row
Log Returns Visualization
Here is the time series plot of daily log returns for the ASX 200:
ASX 200 Log Returns
ASX 200 Log Returns Time Series
ggplot(asx_df, aes(x = Date, y = Return)) +
geom_line(color = "red", linewidth = 0.5) +
geom_hline(yintercept = 0, linetype = "dashed", color = "black", alpha = 0.5) +
scale_x_date(date_labels = "%Y", date_breaks = "2 years") +
labs(
title = "ASX 200 Daily Log Returns",
x = "Year",
y = "Log Return (%)"
) +
theme_classic() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1)
)
Distribution Analysis
To assess the distributional properties of our log returns and check for normality, here are a histogram and a Q-Q plot for the ASX 200:
ASX 200 Distributions
ASX 200 Histogram and Q-Q Plot
# Create a 1x2 layout for histogram and Q-Q plot
par(mfrow = c(1, 2), mar = c(4, 4, 2, 1))
# Histogram
hist(asx_df$Return,
main = "ASX 200 Log Returns Distribution",
xlab = "Log Return (%)",
col = "lightcoral",
border = "white",
breaks = 50,
freq = FALSE)
# Add normal density curve
x_seq <- seq(min(asx_df$Return, na.rm = TRUE), max(asx_df$Return, na.rm = TRUE), length = 100)
lines(x_seq, dnorm(x_seq, mean = mean(asx_df$Return, na.rm = TRUE),
sd = sd(asx_df$Return, na.rm = TRUE)),
col = "darkred", lwd = 2)
# Q-Q Plot
qqnorm(asx_df$Return, main = "ASX 200 Q-Q Plot", col = "red", pch = 16)
qqline(asx_df$Return, col = "darkred", lwd = 2)
Descriptive Statistics
No EDA is complete without some key descriptive statistics:
Compute descriptive statistics
get_stats <- function(x) {
list(
Mean = mean(x),
SD = sd(x),
Min = min(x),
Max = max(x),
Skewness = skewness(x),
Kurtosis = kurtosis(x), # Normal = 3
ADF_p_value = adf.test(x)$p.value
)
}
stats_asx <- get_stats(asx_df$Return)
stats_table <- data.frame(
ASX_200 = unlist(stats_asx)
)
# Print nicely
knitr::kable(stats_table, digits = 3, caption = "Descriptive Statistics of ASX 200 Daily Log-Returns", align = 'c')
| Statistic | ASX_200 |
|---|---|
| Mean | 0.015 |
| SD | 0.991 |
| Min | -10.203 |
| Max | 6.766 |
| Skewness | -0.723 |
| Kurtosis | 11.434 |
| ADF_p_value | 0.010 |
Here are the interpretations of what those numbers mean:
1. Stationarity Confirmed (ADF Test)
The ADF test yielded a p-value of 0.01 (\(p < 0.05\)).
Null Hypothesis Rejected: We reject the assumption of non-stationarity.
Log-returns are confirmed to be stationary, satisfying the prerequisite for GARCH modeling.
2. Volatility & Risk Profile
Volatility Level: The ASX 200 exhibits a daily standard deviation of \(0.991\%\), meaning a typical trading day moves about one percent in either direction.
Extreme Shock Sensitivity: The market shows significant tail risk, with a maximum single-day drawdown of -10.20%, suggesting the potential for sharp corrections.
3. Evidence of “Fat Tails” (Kurtosis)
The ASX 200 exhibits extreme excess kurtosis (\(11.43\)), far exceeding the benchmark of \(3.0\) for a Normal Distribution.
This indicates a Leptokurtic distribution, where probability mass is concentrated in the center and the tails.

- Practically, this proves that “black swan” events occur far more frequently than a standard Gaussian model would predict, providing strong statistical justification for the use of MS-GARCH to capture these regime-shifting tail risks.
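To put fat tails in numbers, here is a quick base-R comparison of tail probabilities under a Normal versus a Student-t distribution (df = 8 is an illustrative choice of a small degrees-of-freedom value, not a fitted estimate):

```r
# Two-sided probability of a move beyond 4 on the standardized scale:
# Normal versus Student-t with df = 8 (illustrative)
p_normal <- 2 * pnorm(-4)
p_t8     <- 2 * pt(-4, df = 8)
p_t8 / p_normal  # the t distribution makes such extreme days far more likely
```

The ratio runs into the tens, which is exactly why a Gaussian model badly understates "black swan" risk.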
4. Asymmetry (Skewness)
- The ASX 200 displays clear negative skewness (\(-0.723\)).
- This asymmetry indicates that the distribution has a longer left tail, reflecting the financial reality that volatility feedback is stronger during downturns.
- In simple terms, markets tend to crash faster and deeper than they rally—a phenomenon known as the “leverage effect” that simple symmetric models often fail to capture.
Summary
All of that long writing basically means…
- Our data testing confirms that returns are not Normally distributed or symmetric.
- We observe a distinct negative skew, meaning ‘panic selling’ creates larger shocks than ‘panic buying.’
- To model this reality, we will use the Skewed Student-t Distribution, which can account for the heavier losses found in the left tail.
Modelling
Now comes the fun part: we will start doing some modelling. This step always excites me the most; I hope it excites you too!
Model Specification Decisions
Before fitting the final model, we need to make three key choices:
- Defining the Regimes (\(K\)): Does the market switch between two states (e.g., Calm vs. Crisis), or is one enough? We will test models with 1 and 2 regimes.
- Selecting the Distribution: Since financial data is rarely Normal, we need to choose a distribution (like the Skewed Student-t) that handles “fat tails” effectively.
- Choosing the Engine: Does the market react the same way to good and bad news? We will test the standard sGARCH against the asymmetric gjrGARCH to see if we need to account for the “leverage effect.”
Finding the Best GARCH Specification (Results Pre-computed)
# 1. Define the Candidate Combinations
# We test 3 dimensions: Volatility Model, Distribution, and Regimes
candidates <- expand.grid(
Model = c("sGARCH", "gjrGARCH"),
Dist = c("norm", "sstd"),
K = c(1, 2),
stringsAsFactors = FALSE
)
# 2. Function to Fit and Extract AIC
run_bakeoff <- function(data_vector, candidates) {
results <- list()
for(i in 1:nrow(candidates)) {
# Extract current settings
mod_type <- candidates$Model[i]
dist_type <- candidates$Dist[i]
k_regimes <- candidates$K[i]
# Create Spec
# Try creating spec based on regime count
# For MSGARCH, the number of regimes is inferred from the length of model/distribution vectors
spec <- CreateSpec(
variance.spec = list(model = rep(mod_type, k_regimes)),
distribution.spec = list(distribution = rep(dist_type, k_regimes))
)
# Fit Model (Using ML for speed)
# We use tryCatch because complex models sometimes fail to converge.
# Note: the tryCatch result is assigned to results[[i]] directly; assigning
# inside the error handler would only create a local copy and be lost.
results[[i]] <- tryCatch({
fit <- FitML(spec = spec, data = data_vector)
# Store Results
data.frame(
Regimes = k_regimes,
Vol_Model = mod_type,
Distribution = dist_type,
AIC = AIC(fit),
BIC = BIC(fit)
)
}, error = function(e) {
# If the fit fails, return NA
data.frame(
Regimes = k_regimes,
Vol_Model = mod_type,
Distribution = dist_type,
AIC = NA,
BIC = NA
)
})
}
# Combine all results into one table
final_table <- do.call(rbind, results)
return(final_table)
}
# 3. Run the Bake-Off (Example on ASX data)
# Ensure you use the 'Return' vector you created in EDA
bakeoff_results <- run_bakeoff(asx_df$Return, candidates)
# 4. Save results to CSV
write.csv(bakeoff_results, "bakeoff_results.csv", row.names = FALSE)
Model Bake-off Results
# Read pre-computed results
bakeoff_results <- read.csv("bakeoff_results.csv")
# 4. Sort and Display Winner
bakeoff_results <- bakeoff_results %>%
arrange(AIC) %>% # Lowest AIC is best
mutate(
AIC_Rank = rank(AIC, na.last = "keep"), # Lower AIC = better rank
BIC_Rank = rank(BIC, na.last = "keep") # Lower BIC = better rank
)
kable(bakeoff_results, caption = "Model Comparison: Sorted by AIC (Best on Top)")
| Regimes | Vol_Model | Distribution | AIC | BIC | AIC_Rank | BIC_Rank |
|---|---|---|---|---|---|---|
| 2 | gjrGARCH | sstd | 15509.79 | 15604.69 | 1 | 2 |
| 1 | gjrGARCH | sstd | 15538.58 | 15579.25 | 2 | 1 |
| 2 | gjrGARCH | norm | 15637.53 | 15705.31 | 3 | 3 |
| 2 | sGARCH | sstd | 15658.98 | 15740.32 | 4 | 5 |
| 1 | sGARCH | sstd | 15678.60 | 15712.49 | 5 | 4 |
| 1 | gjrGARCH | norm | 15767.84 | 15794.95 | 6 | 6 |
| 2 | sGARCH | norm | 15795.19 | 15849.41 | 7 | 7 |
| 1 | sGARCH | norm | 15949.59 | 15969.92 | 8 | 8 |
Okay, now we run into a little problem: the model with the best AIC only ranks 2nd on BIC. So which one should we choose?
The 2-regime gjrGARCH with Skewed Student-t is the clear winner according to the Akaike Information Criterion (AIC), which prioritizes predictive accuracy. Interestingly, the 1-regime version of the same model wins on BIC. This happens because BIC penalizes complexity (adding a second regime) more heavily.
Since our goal is to capture market dynamics and regime shifts (which we know exist theoretically), the AIC winner (Rank 1) is usually the preferred choice as it captures the structural breaks that the 1-regime model misses.
Ideally, I would perform an expanding-window backtest and compare out-of-sample MSE to choose the best model. However, that would take a long time to run, so we will rely on AIC instead.
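The AIC/BIC disagreement above has a simple arithmetic root. Both criteria equal \(-2\log L\) plus a per-parameter penalty, but BIC's penalty grows with the sample size. A quick sketch (n = 6000 is a rough illustrative count of daily returns since 2000, not the exact sample size):

```r
# Why AIC and BIC can disagree: BIC's per-parameter penalty grows with n
n <- 6000                 # roughly 25 years of daily returns (illustrative)
aic_penalty <- 2          # AIC charges 2 per extra parameter
bic_penalty <- log(n)     # BIC charges log(n) per extra parameter (~8.7 here)
bic_penalty / aic_penalty # each extra parameter "costs" over 4x more under BIC
```

With a second regime adding several parameters, BIC's harsher penalty can outweigh the likelihood gain even when AIC prefers the richer model.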
Fit the Model
Best Model’s Specification:
Mathematically, the GJR-GARCH model (Glosten, Jagannathan, and Runkle, 1993) extends the standard GARCH framework by introducing an indicator term to capture the leverage effect—the empirical observation that negative returns (“bad news”) increase volatility more than positive returns (“good news”).
1. The Variance Equation
The conditional variance \(\sigma_t^2\) at time \(t\) is given by:
\[ \sigma_t^2 = \omega + \alpha \epsilon_{t-1}^2 + \gamma \epsilon_{t-1}^2 I_{t-1} + \beta \sigma_{t-1}^2 \]
Where:
- \(\sigma_t^2\): Forecasted variance for the current period.
- \(\omega > 0\): Baseline constant variance.
- \(\epsilon_{t-1}^2\): The squared residual from the previous period (the “shock”).
- \(\alpha \ge 0\): Symmetric reaction to the previous shock (the standard ARCH term).
- \(\beta \ge 0\): Persistence parameter (memory of past variance).
- \(\gamma\) (Gamma): The asymmetry parameter that captures the leverage effect.
- \(I_{t-1}\) (Indicator Function): A binary switch defined as: \[ I_{t-1} = \begin{cases} 1 & \text{if } \epsilon_{t-1} < 0 \quad (\text{Negative Shock / Bad News}) \\ 0 & \text{if } \epsilon_{t-1} \ge 0 \quad (\text{Positive Shock / Good News}) \end{cases} \]
2. Mechanism of Asymmetry
The total reaction to yesterday’s news depends on the sign of the shock:
- Good News (\(\epsilon_{t-1} \ge 0\)): The indicator \(I_{t-1} = 0\). The impact on volatility is \(\alpha \epsilon_{t-1}^2\).
- Bad News (\(\epsilon_{t-1} < 0\)): The indicator \(I_{t-1} = 1\). The impact on volatility is \((\alpha + \gamma) \epsilon_{t-1}^2\).
Since \(\gamma\) is typically positive, negative shocks generate a larger spike in volatility than positive shocks of the same magnitude.
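The asymmetry mechanism above can be sketched in a few lines of R. The parameter values are illustrative, not the fitted estimates from this project:

```r
# News-impact sketch for GJR-GARCH: identical-magnitude shocks, opposite signs
alpha <- 0.02   # symmetric reaction
gamma <- 0.15   # extra reaction to negative shocks only

# (eps < 0) coerces to 1 for bad news and 0 for good news, acting as I_{t-1}
news_impact <- function(eps) (alpha + gamma * (eps < 0)) * eps^2

news_impact(2)    # good news: alpha * 4 = 0.08
news_impact(-2)   # bad news: (alpha + gamma) * 4 = 0.68
```

A shock of the same size moves volatility several times more when it is negative, which is the leverage effect in miniature.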
3. In the MS-GARCH Context
In a Markov-Switching framework with \(K=2\) regimes, the model estimates two distinct sets of these parameters. The variance at time \(t\) depends on the active regime \(s_t\):
\[ \sigma_t^2 = \omega_{s_t} + (\alpha_{s_t} + \gamma_{s_t} I_{t-1}) \epsilon_{t-1}^2 + \beta_{s_t} \sigma_{t-1}^2 \]
This allows the model to capture not just the leverage effect, but different intensities of the leverage effect during Calm (\(s_t=1\)) vs. Crisis (\(s_t=2\)) periods.
Fitting the Best MS-GARCH Model (Model Pre-fitted)
# 1. Define the Best Specification (e.g., K=2, gjrGARCH, sstd)
# MSGARCH infers the number of regimes from vector lengths
spec_best <- CreateSpec(
variance.spec = list(model = c("gjrGARCH", "gjrGARCH")),
distribution.spec = list(distribution = c("sstd", "sstd"))
)
# 2. Fit using Maximum Likelihood (MLE)
# 'asx_df$Return' is your data vector from the EDA step
fit_best <- FitML(spec = spec_best, data = asx_df$Return)
# 3. Save the fitted model
saveRDS(fit_best, "fit_best_model.rds")
Fitting the Best MS-GARCH Model
# Load pre-computed model info
model_info <- readRDS("model_info.rds")
# 3. Display model information
cat("Model fitted successfully!\n")
Model fitted successfully!
cat(sprintf("AIC: %.2f\n", model_info$AIC))
AIC: 15509.79
cat(sprintf("BIC: %.2f\n", model_info$BIC))
BIC: 15604.69
cat(sprintf("Convergence: %d\n", model_info$convergence))
I don’t know about you, but this raw summary is the kind of output I can’t stand. I promise nicer tables will be presented later, but for now I will show some visualizations before diving into what each of these numbers means.
Interpretation
Interactive Plot with Fixed Segments
# Load pre-computed model results
model_results <- read.csv("model_results.csv")
# ---------------------------------------------------------
# 1. PREPARE THE DATA
# ---------------------------------------------------------
# Create a merged dataframe
df_plot <- model_results %>%
mutate(
# Create "next day" values for geom_segment
Date_Next = lead(Date),
Return_Next = lead(Return)
) %>%
# Remove the last row (NA due to lead)
na.omit()
# Create separate datasets (same as before)
df_low <- df_plot %>% filter(Prob_HighVol <= 0.33)
df_medium <- df_plot %>% filter(Prob_HighVol > 0.33 & Prob_HighVol <= 0.66)
df_high <- df_plot %>% filter(Prob_HighVol > 0.66)
plot_ly() %>%
# 1. Low Probability (Green)
add_segments(data = df_low,
x = ~Date, y = ~Return,
xend = ~Date_Next, yend = ~Return_Next,
line = list(color = "#28A745", width = 2),
name = "Regime 1 (≤33%)",
hoverinfo = "skip") %>%
# 2. Medium Probability (Orange)
add_segments(data = df_medium,
x = ~Date, y = ~Return,
xend = ~Date_Next, yend = ~Return_Next,
line = list(color = "#EF7722", width = 2),
name = "Regime 1 (33-66%)",
hoverinfo = "skip") %>%
# 3. High Probability (Red)
add_segments(data = df_high,
x = ~Date, y = ~Return,
xend = ~Date_Next, yend = ~Return_Next,
line = list(color = "#E62727", width = 2),
name = "Regime 2 (>66%)",
hoverinfo = "skip") %>%
# 4. Markers (Overlay points for hover info)
add_markers(data = df_plot,
x = ~Date,
y = ~Return,
marker = list(
color = "black",
size = 1,
opacity = 0.7
),
hovertemplate = paste(
"<b>Date:</b> %{x}<br>",
"<b>Return:</b> %{y:.4f}<br>",
"<b>Regime 2 Probability:</b> %{customdata:.3f}<br>",
"<extra></extra>"
),
customdata = ~Prob_HighVol,
name = "Daily Returns"
) %>%
# 5. Restore original layout and labels
colorbar(
title = list(text = "Prob. Regime 2", side = "right"),
len = 0.8
) %>%
layout(
title = list(
text = "Asset Returns Colored by Regime Probability<br><sub>Green = Regime 1 | Red = Regime 2</sub>",
font = list(size = 16)
),
xaxis = list(
title = "Date",
type = "date"
),
yaxis = list(
title = "Log Returns (%)"
),
hovermode = "closest",
showlegend = FALSE
)
While our model successfully distinguishes two volatility regimes, the interpretation is not so straightforward. It is not merely a distinction between ‘Low’ and ‘High’ volatility levels. Notably, the ‘Green’ regime encompasses the violent 2020 COVID-19 crash—an observation that would be contradictory if the regimes were classified solely by magnitude.
Instead, the model has identified two structurally different types of volatility:
Regime 1 (Green): Characterized by lower persistence but ‘fatter tails.’ This regime governs both normal market conditions and sudden, short-lived ‘flash crashes’ (like COVID-19) where the market recovers quickly.
Regime 2 (Red): Characterized by extremely high persistence. This regime governs structural bear markets and systemic financial crises (like the GFC) where high volatility becomes entrenched for extended periods.
To understand the drivers behind this sophisticated distinction, we will now analyze the estimated parameters.
Code
# ---------------------------------------------------------
# PRICE TIME SERIES COLORED BY REGIME PROBABILITY
# ---------------------------------------------------------
# Create a merged dataframe for prices using pre-computed results
df_price_plot <- data.frame(
Date = asx_df$Date,
Price = asx_df$ASX200.Adjusted,
Prob_Regime2 = model_results$Prob_HighVol
) %>%
mutate(
# Create "next day" values for geom_segment
Date_Next = lead(Date),
Price_Next = lead(Price)
) %>%
# Remove the last row (NA due to lead)
na.omit()
# Create separate datasets for each color category
df_price_low <- df_price_plot %>% filter(Prob_Regime2 <= 0.33)
df_price_medium <- df_price_plot %>% filter(Prob_Regime2 > 0.33 & Prob_Regime2 <= 0.66)
df_price_high <- df_price_plot %>% filter(Prob_Regime2 > 0.66)
# Create interactive plot with separate colored traces for prices
plot_ly() %>%
# 1. Low Probability (Green)
add_segments(data = df_price_low,
x = ~Date, y = ~Price,
xend = ~Date_Next, yend = ~Price_Next,
line = list(color = "#28A745", width = 2),
name = "Regime 1 (≤33%)",
hoverinfo = "skip") %>%
# 2. Medium Probability (Orange)
add_segments(data = df_price_medium,
x = ~Date, y = ~Price,
xend = ~Date_Next, yend = ~Price_Next,
line = list(color = "#EF7722", width = 2),
name = "Regime 1 (33-66%)",
hoverinfo = "skip") %>%
# 3. High Probability (Red)
add_segments(data = df_price_high,
x = ~Date, y = ~Price,
xend = ~Date_Next, yend = ~Price_Next,
line = list(color = "#E62727", width = 2),
name = "Regime 2 (>66%)",
hoverinfo = "skip") %>%
# 4. Markers (Overlay points for hover info)
add_markers(data = df_price_plot,
x = ~Date,
y = ~Price,
marker = list(
color = "black",
size = 1,
opacity = 0.7
),
hovertemplate = paste(
"<b>Date:</b> %{x}<br>",
"<b>Price:</b> %{y:.2f}<br>",
"<b>Regime 2 Probability:</b> %{customdata:.3f}<br>",
"<extra></extra>"
),
customdata = ~Prob_Regime2,
name = "Daily Prices"
) %>%
# 5. Restore original layout and labels
colorbar(
title = list(text = "Prob. Regime 2", side = "right"),
len = 0.8
) %>%
layout(
title = list(
text = "ASX 200 Prices Colored by Regime Probability<br><sub>Green = Regime 1 | Red = Regime 2</sub>",
font = list(size = 16)
),
xaxis = list(
title = "Date",
type = "date"
),
yaxis = list(
title = "ASX 200 Price"
),
hovermode = "closest",
showlegend = FALSE
)
The time-series visualization highlights a profound distinction in market dynamics between the 2008 Global Financial Crisis and the 2020 COVID-19 crash.
The 2008 GFC is dominated by Regime 2 (Red), which is characterized by high persistence (\(\beta \approx 0.93\)). This aligns with the nature of a systemic banking crisis, where volatility ‘clustered’ for over 18 months as the structural integrity of the financial system was in question. The model correctly identifies this as a period where risk was entrenched and slow to decay.
In contrast, the 2020 COVID crash falls into Regime 1 (Green). Despite featuring the sharpest single-day declines in the dataset, the model identifies this as a ‘Fat Tail’ event (\(\nu \approx 8.4\)) rather than a ‘High Persistence’ event. This statistical classification perfectly mirrors the fundamental reality: the crash was an exogenous shock driven by news (the pandemic) rather than a structural financial failure. Consequently, the market recovery was ‘V-shaped’ and rapid, matching the lower persistence profile (\(\beta \approx 0.84\)) of Regime 1.
Code
# Load pre-computed coefficients
coef_df <- read.csv("coef_df.csv") %>%
as_tibble()
# Rename columns to avoid special characters
names(coef_df) <- c("ParamLabel", "Estimate", "Std_Error", "t_value", "p_value")
process_params <- function(df) {
df %>%
filter(!str_detect(ParamLabel, "^P_")) %>%
mutate(
Regime = str_extract(ParamLabel, "\\d+$"),
BaseParam = str_remove(ParamLabel, "_\\d+$"),
Symbol = case_when(
BaseParam == "alpha0" ~ "$\\omega$ (Constant)",
BaseParam == "alpha1" ~ "$\\alpha$ (ARCH)",
BaseParam == "alpha2" ~ "$\\gamma$ (Leverage)",
BaseParam == "beta" ~ "$\\beta$ (Persistence)",
BaseParam == "nu" ~ "$\\nu$ (Degree of Freedom)",
BaseParam == "xi" ~ "$\\xi$ (Skewness)",
TRUE ~ BaseParam
),
SortOrder = case_when(
BaseParam == "alpha0" ~ 1,
BaseParam == "alpha1" ~ 2,
BaseParam == "alpha2" ~ 3,
BaseParam == "beta" ~ 4,
BaseParam == "nu" ~ 5,
BaseParam == "xi" ~ 6,
TRUE ~ 99
),
FormattedVal = paste0(
sprintf("%.4f", Estimate),
" (p ",
ifelse(p_value < 0.001, "< 0.001", sprintf("= %.3f", p_value)),
")"
)
) %>%
select(Symbol, SortOrder, Regime, FormattedVal) %>%
pivot_wider(names_from = Regime, values_from = FormattedVal, names_prefix = "Regime ") %>%
arrange(SortOrder) %>%
select(-SortOrder)
}
table1_data <- process_params(coef_df)
kable(table1_data, caption = "MS-GJR-GARCH Parameter Estimates", align = 'c')
| Symbol | Regime 1 | Regime 2 |
|---|---|---|
| \(\omega\) (Constant) | 0.0452 (p < 0.001) | 0.0066 (p < 0.001) |
| \(\alpha\) (ARCH) | 0.0000 (p = 0.486) | 0.0001 (p = 0.493) |
| \(\gamma\) (Leverage) | 0.1662 (p < 0.001) | 0.1200 (p < 0.001) |
| \(\beta\) (Persistence) | 0.8458 (p < 0.001) | 0.9322 (p < 0.001) |
| \(\nu\) (Degree of Freedom) | 8.4213 (p < 0.001) | 19.7948 (p = 0.002) |
| \(\xi\) (Skewness) | 0.7782 (p < 0.001) | 0.8892 (p < 0.001) |
Interpretation of Parameter Estimates
The estimated parameters from the MS-GJR-GARCH model provide the mathematical “fingerprint” for each regime, explaining the distinct market behaviors observed in the time-series plots.
1. Persistence (\(\beta\)): The “Memory” of the Market
- Regime 1 (\(\beta \approx 0.85\)): This lower persistence value indicates that volatility shocks in this regime decay relatively quickly. This explains the “V-shaped” recovery of the 2020 COVID-19 crash; the market panic was intense but did not entrench itself into a long-term trend.
- Regime 2 (\(\beta \approx 0.93\)): The significantly higher beta value indicates “long memory.” In this regime, high volatility today is a strong predictor of high volatility tomorrow. This mathematical structure creates the “sticky,” clustered volatility observed during the 18-month duration of the Global Financial Crisis (2008).
2. Degrees of Freedom (\(\nu\)): The “Black Swan” Indicator
- Regime 1 (\(\nu \approx 8.42\)): A low Degree of Freedom indicates a “fat-tailed” distribution where extreme outliers are more probable. This tells the model to expect sudden, violent shocks (like the flash crash of March 2020) as a feature of this regime, rather than a structural break.
- Regime 2 (\(\nu \approx 19.79\)): A higher Degree of Freedom implies a distribution closer to Normality. This suggests that the risk in the “Red” regime is not driven by random outliers, but rather by the sustained, high-variance trend itself.
3. The Leverage Effect (\(\gamma\) vs \(\alpha\)): Sensitivity to Bad News
- Alpha (\(\alpha \approx 0\)): In both regimes, the symmetric ARCH term is effectively zero, meaning positive returns (market rallies) have negligible impact on future volatility.
- Gamma (\(\gamma > 0\)): Both regimes show a significant positive Gamma, confirming the “Leverage Effect”—volatility is driven almost entirely by negative returns (panic selling).
- Comparative Insight: Interestingly, Regime 1 has a higher Gamma (\(0.166\)) than Regime 2 (\(0.120\)). This implies that the market is more sensitive to bad news during “normal” or “flash” periods than during deep structural crises. In a crisis (Regime 2), high volatility is already the baseline, so new negative information has a marginally smaller shock effect than it does in a calmer environment.
4. Skewness (\(\xi\)): The Downside Bias
- The skewness parameter \(\xi\) is below 1.0 for both regimes, indicating a negative skew (left tail is longer). However, Regime 1 is more negatively skewed (\(0.78\)) than Regime 2 (\(0.89\)). This reinforces the characterization of Regime 1 as the home of “Flash Crashes”—structurally more prone to sudden, sharp downside moves than the grinding volatility of Regime 2.
Code
# Load pre-computed transition data
transition_data <- read.csv("transition_data.csv")
kable(transition_data, digits = 4, caption = "Regime Probabilities & Persistence", align = 'c')

| Metric | Regime_1 | Regime_2 |
|---|---|---|
| Transition Probability (P_ii) | 0.9964 | 0.9965 |
| Expected Duration (Days) | 276.2344 | 288.3764 |
| Unconditional Probability | 0.4892 | 0.5108 |
The “Regime Probabilities & Persistence” table characterizes the stability and frequency of the two volatility regimes detected by the model. These metrics indicate that the market transitions between states gradually rather than abruptly.
1. Transition Probability (\(P_{ii}\)): Regime Stability
- Regime 1 (\(P_{11} \approx 0.9964\)): The probability of remaining in Regime 1 the following day, given the market is currently in Regime 1, is 99.64%.
- Regime 2 (\(P_{22} \approx 0.9965\)): Similarly, the probability of remaining in Regime 2 the following day is 99.65%.
- Implication: The high values for both diagonal elements of the transition matrix (\(P_{11}\) and \(P_{22}\)) indicate a high degree of persistence. Once a volatility regime is established, it is statistically unlikely to switch states on any given day. This suggests that volatility clusters into stable blocks rather than fluctuating randomly.
2. Expected Duration: Average Regime Length
- Regime 1 Duration: \(\approx 276\) trading days.
- Regime 2 Duration: \(\approx 288\) trading days.
- Implication: Based on the transition probabilities, the expected duration for each regime exceeds one standard trading year (typically ~252 days). This result characterizes the identified regimes as medium-to-long-term market phases (economic climates) rather than short-term transient events.
3. Unconditional Probability: Long-Term Distribution
- Regime 1: \(48.9\%\)
- Regime 2: \(51.1\%\)
- Implication: The unconditional probability represents the likelihood of the market being in a specific regime at any random point in time over the long run. The near-even split indicates that the “High Persistence” state (Regime 2) is a recurring structural feature of the ASX 200, occurring with roughly the same frequency as the “Low Persistence” state (Regime 1).
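All three rows of the table can be recovered directly from the transition matrix. The sketch below uses the rounded diagonal probabilities quoted above (with the off-diagonals implied), so the durations differ slightly from the table, which was computed from the unrounded estimates:

```r
# 2x2 transition matrix: P[i, j] = Pr(next regime = j | current regime = i)
P <- matrix(c(0.9964, 0.0036,
              0.0035, 0.9965), nrow = 2, byrow = TRUE)

# Expected duration of regime i: 1 / (1 - P_ii)
expected_duration <- 1 / (1 - diag(P))

# Unconditional (stationary) probabilities: the left eigenvector of P
# associated with eigenvalue 1, normalized to sum to one
ev   <- eigen(t(P))
stat <- Re(ev$vectors[, which.max(Re(ev$values))])
stat <- stat / sum(stat)

round(expected_duration, 1)  # ~278 and ~286 days from the rounded P_ii
round(stat, 3)               # near-even long-run split, as in the table
```

The duration formula is the mean of a geometric distribution: each day the regime "survives" with probability \(P_{ii}\), so the expected run length is \(1/(1-P_{ii})\).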
Model Diagnostics
To validate the reliability of the MS-GJR-GARCH model, we perform diagnostic checks on the residuals to ensure no statistical patterns remain.
Code
# 1. Force reload data using Base R (most robust method for this specific chunk)
raw_data <- read.csv("ASX200_Fixed.csv")
# 2. Simple return calculation
prices <- raw_data$ASX200.Adjusted
# Calculate log returns: diff(log(price)) * 100
ret_vec <- diff(log(prices)) * 100
# Remove any NAs carried over from missing prices (diff itself introduces none)
ret_vec <- na.omit(ret_vec)
# Ensure it is a pure numeric vector (strip attributes)
ret_vec <- as.vector(ret_vec)
# 3. Create Spec and Fit Model directly
library(MSGARCH) # Ensure library is loaded
spec_simple <- CreateSpec(
variance.spec = list(model = c("gjrGARCH", "gjrGARCH")),
distribution.spec = list(distribution = c("sstd", "sstd"))
)
# Fit the model
fit_object <- FitML(spec = spec_simple, data = ret_vec)
# 4. Generate Diagnostics
# Explicitly pass 'x = ret_vec' to avoid "data is NULL" errors
vol_dynamic <- Volatility(fit_object, x = ret_vec)
pit_values <- PIT(fit_object, x = ret_vec)
std_resid <- ret_vec / vol_dynamic
# 5. Plotting
par(mfrow = c(1, 2))
# ACF Plot
acf(std_resid^2,
main = "ACF of Squared Std. Residuals",
lag.max = 20,
ylim = c(-0.1, 0.5))
# PIT Histogram
# Note: PIT() already returns values on the uniform (0, 1) scale, so we plot
# them directly; wrapping them in pnorm() would compress and distort the shape
hist(pit_values,
breaks = 20,
main = "PIT Histogram",
xlab = "Probability",
xlab = "Probability",
col = "gray",
border = "white",
freq = FALSE)
abline(h = 1, col = "red", lty = 2, lwd = 2)
Code
cat("Diagnostics generated successfully.\n")

Diagnostics generated successfully.
1. ACF of Squared Standardized Residuals
- Why do we need it?: We need the ACF plot to verify that our model successfully extracted all the “volatility clustering” (predictable time patterns) from the data. This ensures the remaining noise is truly random and has no “memory” left over.
- What do we want?: The bars represent the autocorrelation between squared residuals today and \(k\) days ago. The dashed horizontal lines are the 95% confidence bands. Ideally, we want all bars to stay inside these bands, meaning any remaining correlation is effectively zero.
- Interpretation: Our ACF plot shows that almost all bars are well within the 95% confidence intervals, indicating no significant leftover patterns. This confirms the model has adequately accounted for the conditional heteroskedasticity, validating that the volatility clustering has been fully captured.
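A Ljung-Box test provides a formal counterpart to the eyeballed ACF plot. The sketch below assumes `std_resid` from the diagnostics chunk above; the simulated stand-in exists only so the chunk runs on its own:

```r
# Ljung-Box test on squared standardized residuals: the null hypothesis is
# "no autocorrelation up to the chosen lag", so a LARGE p-value supports
# the model (no leftover volatility clustering).
if (!exists("std_resid")) std_resid <- rnorm(2000)  # stand-in for illustration

lb <- Box.test(std_resid^2, lag = 20, type = "Ljung-Box")
print(lb)
# A p-value above 0.05 means we cannot reject the null of no remaining
# ARCH effects at the 5% level.
```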
2. PIT (Probability Integral Transform) Histogram
- Why do we need it?: We need the PIT histogram to verify that our chosen mathematical distribution (the Skewed Student-t) matches the actual shape of the market data. If we chose the wrong distribution (e.g., Normal), our risk estimates (like Value-at-Risk) would be dangerously inaccurate.
- What do we want?: We want the histogram to look flat and uniform, like a rectangle. A U-shape implies the model underestimated extreme risks (“fat tails”), while a Hill-shape implies it overestimated them.
- Interpretation: Our PIT histogram is relatively flat and uniform, without any distinct U-shape. This indicates that the Skewed Student-t distribution was the correct choice. It successfully captures the asymmetry and “fat tails” of the ASX 200, whereas a standard Normal distribution would likely have failed this test.
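A one-sample Kolmogorov–Smirnov test formalizes the “flat histogram” criterion. The sketch below assumes `pit_values` from the diagnostics chunk above; the uniform stand-in exists only so the chunk runs on its own:

```r
# KS test of the PIT values against the Uniform(0, 1) distribution.
# Under a correctly specified conditional distribution, the PIT values
# should be indistinguishable from uniform draws.
if (!exists("pit_values")) pit_values <- runif(2000)  # stand-in for illustration

ks <- ks.test(pit_values, "punif")
print(ks)
# A p-value above 0.05 means the skewed Student-t density is statistically
# consistent with the observed return distribution.
```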
Forecasting and Validation
Having estimated the model parameters, the critical question remains: Does this added complexity actually translate to better predictive power? To answer this, we will validate the model in two stages: an in-sample visual inspection and a rigorous out-of-sample forecasting simulation.
1. In-Sample Visual Inspection
First, we perform a “sanity check.” We overlay the model’s estimated volatility (\(\sigma_t\)) onto the historical returns. While we cannot observe true volatility, we can compare the model’s output against the magnitude of daily returns (\(|r_t|\)).
What we are looking for: We want to ensure the model’s volatility spikes align precisely with periods of market turbulence (high absolute returns) and that it calms down during stable periods.
Interactive Plot MS-GJR-GARCH In-Sample Volatility vs. Realized Shocks
# 1. Ensure Data Availability (Self-healing)
if (!exists("asx_df") || is.null(asx_df$Return)) {
asx_df <- read.csv("ASX200_Fixed.csv")
asx_df$Date <- as.Date(asx_df$Date)
asx_df$Return <- c(NA, diff(log(asx_df$ASX200.Adjusted))) * 100
asx_df <- na.omit(asx_df)
}
# 2. Get True Model Volatility
# We refit quickly to get the exact path (ensures consistency with diagnostics)
library(MSGARCH)
spec_best <- CreateSpec(
variance.spec = list(model = c("gjrGARCH", "gjrGARCH")),
distribution.spec = list(distribution = c("sstd", "sstd"))
)
fit_best <- FitML(spec = spec_best, data = asx_df$Return)
vol_model <- Volatility(fit_best, x = asx_df$Return)
# 3. Create Plotting Dataframe
comparison_df <- data.frame(
Date = asx_df$Date,
Actual_Magnitude = abs(asx_df$Return), # |r_t| proxy for realized vol
Predicted_Vol = vol_model # Actual MS-GARCH sigma_t
)
# 4. Create Interactive Plot
plot_ly(comparison_df, x = ~Date) %>%
# Layer 1: The "Actual" Market Shocks (Gray)
add_lines(y = ~Actual_Magnitude,
name = "Actual Magnitude (|r|)",
line = list(color = "gray70", width = 1),
opacity = 0.4,
hovertemplate = paste("Date: %{x|%Y-%m-%d}<br>",
"Actual Magnitude: %{y:.3f}%<br>",
"<extra></extra>")) %>%
# Layer 2: The Model's Prediction (Red)
add_lines(y = ~Predicted_Vol,
name = "MS-GJR-GARCH Volatility (σ)",
line = list(color = "#D9534F", width = 1.5),
hovertemplate = paste("Date: %{x|%Y-%m-%d}<br>",
"Model Volatility: %{y:.3f}%<br>",
"<extra></extra>")) %>%
# Layout and styling
layout(
title = list(
text = "In-Sample Fit: Model-Implied Volatility vs. Realized Shocks",
subtitle = "Red line = The volatility regime estimated by the model"
),
xaxis = list(
title = "Date",
type = "date"
),
yaxis = list(
title = "Volatility (%)"
),
legend = list(
orientation = "h",
x = 0.5,
xanchor = "center",
y = -0.2
),
hovermode = "x unified"
)

2. Out-of-Sample Testing: The Expanding Window Approach
Visual fit is often misleading because the model has already “seen” the data. To test how the model would perform in a real trading environment, we employ an Expanding Window Cross-Validation.
The Simulation Strategy: Instead of a simple train/test split, we simulate a trader re-calibrating their model every week.
- Start: We train the model on the initial 500 days.
- Forecast: We predict volatility for the next 5 days (one trading week).
- Expand: We add those 5 days to our training set.
- Refit & Repeat: We re-estimate all parameters and forecast the next week, repeating this process until we reach the end of the dataset.
This method is computationally expensive but eliminates “look-ahead bias” and proves the model can adapt to new regimes in real-time.
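The schedule described above can be sketched without any model fitting; each row records one refit’s training and forecast boundaries (`n_total` here is an illustrative sample size, not the actual dataset length):

```r
# Expanding-window schedule: train on days 1..train_end, forecast 5 days ahead
min_train_size <- 500
refit_every    <- 5
n_total        <- 5200   # illustrative sample size

refit_starts <- seq(1, n_total - min_train_size, by = refit_every)
schedule <- data.frame(
  refit          = seq_along(refit_starts),
  train_end      = min_train_size + refit_starts - 1,
  forecast_start = min_train_size + refit_starts,
  forecast_end   = pmin(min_train_size + refit_starts + refit_every - 1, n_total)
)
head(schedule, 3)
# The training window always starts at day 1 and grows by 5 days per refit,
# which is what distinguishes an "expanding" window from a fixed-length
# rolling window.
```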
Fast Parallel Cross-Validation (Results Pre-computed)
# 1. Full Dataset CV Setup (Expanding Window)
n_total <- length(asx_df$Return)
min_train_size <- 500 # Minimum training window
refit_every <- 5 # Refit every 5 days
# Use all observations for CV - expanding window approach
n_test <- n_total - min_train_size
# 2. Setup
# Detect your cores (Leave 1 free for OS)
n_cores <- parallel::detectCores() - 1
registerDoParallel(cores = n_cores)
cat(sprintf("Running in parallel on %d cores...\n", n_cores))
# Define refit indices for full dataset CV
refit_indices <- seq(from = 1, to = n_test, by = refit_every)
# Parallel CV on full dataset
results_list <- foreach(i = refit_indices, .packages = c('MSGARCH')) %dopar% {
# Current position in full dataset
current_pos <- min_train_size + i - 1
# Training window: expanding from minimum size
train_data <- asx_df$Return[1:current_pos]
# Create and fit model
spec_local <- CreateSpec(
variance.spec = list(model = c("gjrGARCH", "gjrGARCH")),
distribution.spec = list(distribution = c("sstd", "sstd"))
)
fit <- FitML(spec = spec_local, data = train_data)
# Predict next 5 days
pred <- predict(fit, nahead = refit_every, do.return.draw = FALSE)
data.frame(
Step_Index = i:(min(i + refit_every - 1, n_test)),
Predicted_Vol = pred$vol[1:min(refit_every, n_test - i + 1)]
)
}
# 3. Combine Results
final_results <- do.call(rbind, results_list)
stopImplicitCluster()
# Clean up and merge with actual data for full dataset CV
final_results <- final_results[order(final_results$Step_Index), ]
final_results <- final_results[!duplicated(final_results$Step_Index), ]
test_indices <- (min_train_size + 1):(min_train_size + nrow(final_results))
final_results$Actual_Return <- asx_df$Return[test_indices]
final_results$Date <- asx_df$Date[test_indices]
# 4. Calculate CV Metrics for 2-Regime
cv_metrics_2regime <- data.frame(
MSE = mean((final_results$Predicted_Vol - abs(final_results$Actual_Return))^2),
RMSE = sqrt(mean((final_results$Predicted_Vol - abs(final_results$Actual_Return))^2)),
MAE = mean(abs(final_results$Predicted_Vol - abs(final_results$Actual_Return))),
Model = "2-Regime MS-GJR-GARCH"
)
# 5. Save results to CSV
write.csv(final_results, "cv_results_2regime.csv", row.names = FALSE)
write.csv(cv_metrics_2regime, "cv_metrics_2regime.csv", row.names = FALSE)

Fast Parallel Cross-Validation
# Load pre-computed results
final_results <- read.csv("cv_results_2regime.csv")
cv_metrics_2regime <- read.csv("cv_metrics_2regime.csv")

3. Benchmarking: Single-Regime Model Comparison
Is the Regime-Switching component actually necessary? To prove the value of the Hidden Markov Model, we must compare it against a benchmark.
We will run the exact same expanding window simulation using a standard 1-Regime GJR-GARCH. If our complex 2-Regime model cannot beat this simpler benchmark, then the added complexity is not justified.
Single-Regime Cross-Validation (Parallel) (Results Pre-computed)
# Setup parallel processing for 1-regime model
n_cores <- parallel::detectCores() - 1
registerDoParallel(cores = n_cores)
# Define indices for parallel processing (refit every 5 days)
refit_indices <- seq(from = 1, to = n_test, by = 5)
# Parallel CV for 1-regime model on full dataset
results_1reg_list <- foreach(i = refit_indices, .packages = c('MSGARCH')) %dopar% {
current_pos <- min_train_size + i - 1
train_data <- asx_df$Return[1:current_pos]
spec_1reg <- CreateSpec(
variance.spec = list(model = "gjrGARCH"),
distribution.spec = list(distribution = "sstd")
)
fit <- FitML(spec = spec_1reg, data = train_data)
pred <- predict(fit, nahead = refit_every, do.return.draw = FALSE)
data.frame(
Step_Index = i:(min(i + refit_every - 1, n_test)),
Predicted_Vol = pred$vol[1:min(refit_every, n_test - i + 1)]
)
}
# Combine 1-regime results
results_1reg <- do.call(rbind, results_1reg_list)
stopImplicitCluster()
# Clean up 1-regime results
results_1reg <- results_1reg[order(results_1reg$Step_Index), ]
results_1reg <- results_1reg[!duplicated(results_1reg$Step_Index), ]
test_indices_1reg <- (min_train_size + 1):(min_train_size + nrow(results_1reg))
results_1reg$Actual_Return <- asx_df$Return[test_indices_1reg]
results_1reg$Date <- asx_df$Date[test_indices_1reg]
# Calculate CV metrics for 1-regime
cv_metrics_1regime <- data.frame(
MSE = mean((results_1reg$Predicted_Vol - abs(results_1reg$Actual_Return))^2),
RMSE = sqrt(mean((results_1reg$Predicted_Vol - abs(results_1reg$Actual_Return))^2)),
MAE = mean(abs(results_1reg$Predicted_Vol - abs(results_1reg$Actual_Return))),
Model = "1-Regime GJR-GARCH"
)
# Save results to CSV
write.csv(results_1reg, "cv_results_1regime.csv", row.names = FALSE)
write.csv(cv_metrics_1regime, "cv_metrics_1regime.csv", row.names = FALSE)

Single-Regime Cross-Validation (Parallel)
# Load pre-computed results
results_1reg <- read.csv("cv_results_1regime.csv")
cv_metrics_1regime <- read.csv("cv_metrics_1regime.csv")

4. The ‘Horse Race’: Visualizing Performance
Now we compare the forecasts side-by-side. The plot below zooms in on the 2008 Global Financial Crisis. We are looking for the speed of reaction: Does the 2-Regime model (Red) react faster to the onset of the crisis than the standard model (Blue)?
Interactive Model Comparison Plots
# Combine results for comparison
final_results$Model <- "2-Regime MS-GJR-GARCH"
results_1reg$Model <- "1-Regime GJR-GARCH"
# Ensure Date columns are properly formatted as Date objects (not character strings)
final_results$Date <- as.Date(final_results$Date)
results_1reg$Date <- as.Date(results_1reg$Date)
combined_results <- rbind(
data.frame(
Date = final_results$Date,
Actual_Return = final_results$Actual_Return,
Actual_Mag = abs(final_results$Actual_Return),
Predicted_Vol = final_results$Predicted_Vol,
Model = final_results$Model
),
data.frame(
Date = results_1reg$Date,
Actual_Return = results_1reg$Actual_Return,
Actual_Mag = abs(results_1reg$Actual_Return),
Predicted_Vol = results_1reg$Predicted_Vol,
Model = results_1reg$Model
)
)
# Interactive plot for 2008 crisis period
plot_2008_data <- combined_results %>%
filter(Date >= "2007-01-01" & Date <= "2009-12-31")
plot_2008 <- plot_ly(plot_2008_data, x = ~Date) %>%
add_lines(y = ~Actual_Mag, name = "Realized Abs. Return (Proxy)",
line = list(color = "gray", width = 2), opacity = 0.6) %>%
add_lines(data = plot_2008_data %>% filter(Model == "2-Regime MS-GJR-GARCH"),
y = ~Predicted_Vol, name = "2-Regime Model",
line = list(color = "red", width = 3)) %>%
add_lines(data = plot_2008_data %>% filter(Model == "1-Regime GJR-GARCH"),
y = ~Predicted_Vol, name = "1-Regime Model",
line = list(color = "blue", width = 3, dash = "dash")) %>%
layout(
title = list(
text = "Model Comparison: Out-of-Sample Performance during GFC (2008)",
subtitle = "Comparing Model Sigma against Realized Absolute Returns"
),
xaxis = list(title = "Date"),
yaxis = list(title = "Volatility (%)"),
legend = list(x = 0.02, y = 0.98),
hovermode = "x unified"
)
plot_2008

Interactive Full Period Comparison
# Interactive plot for full test period
plot_full <- plot_ly(combined_results, x = ~Date) %>%
add_lines(y = ~Actual_Mag, name = "Realized Abs. Return (Proxy)",
line = list(color = "gray", width = 1), opacity = 0.5) %>%
add_lines(data = combined_results %>% filter(Model == "2-Regime MS-GJR-GARCH"),
y = ~Predicted_Vol, name = "2-Regime Model",
line = list(color = "red", width = 2)) %>%
add_lines(data = combined_results %>% filter(Model == "1-Regime GJR-GARCH"),
y = ~Predicted_Vol, name = "1-Regime Model",
line = list(color = "blue", width = 2, dash = "dash")) %>%
layout(
title = list(
text = "Full Period Model Comparison: Predicted vs Actual Volatility",
subtitle = "Comparing Model Sigma against Realized Absolute Returns"
),
xaxis = list(
title = "Date",
tickformat = "%Y", # Keep axis clean (Years only)
hoverformat = "%Y-%m-%d", # Force hover to show full date (e.g. 2008-09-15)
dtick = "M24"
),
yaxis = list(title = "Volatility / Magnitude"),
legend = list(x = 0.02, y = 0.98),
hovermode = "x unified"
)
plot_full

5. The Scorecard: Error Metrics
Finally, we quantify the performance difference. Since ‘True Volatility’ is unobservable, we use Realized Absolute Returns (\(|r_t|\)) as our proxy for truth. We compare the models across three standard metrics:
- MSE (Mean Squared Error): Penalizes large errors heavily.
- RMSE (Root Mean Squared Error): Interpretable in the same units as volatility.
- MAE (Mean Absolute Error): Measures the average magnitude of the error.
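A compact helper, equivalent to the per-model calculations in the CV chunks above, shows how all three metrics are computed against the absolute-return proxy (the numbers in the usage call are made up):

```r
# Volatility forecast error metrics against the |r_t| proxy
vol_metrics <- function(predicted_vol, actual_return) {
  err <- predicted_vol - abs(actual_return)
  c(MSE  = mean(err^2),
    RMSE = sqrt(mean(err^2)),
    MAE  = mean(abs(err)))
}

# Toy usage with made-up forecasts and returns:
vol_metrics(predicted_vol = c(1.0, 1.2, 0.8),
            actual_return = c(0.9, -1.5, 0.4))
# MSE ~ 0.0867, RMSE ~ 0.2944, MAE ~ 0.2667
```

Because MSE squares the errors, a few large misses during a crisis dominate it, while MAE treats every day equally; reporting both guards against a model that wins only on the quiet days.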
Prediction Accuracy Comparison Table
# Combine metrics
combined_metrics <- rbind(cv_metrics_2regime, cv_metrics_1regime)
# Create comparison table
comparison_table <- combined_metrics %>%
select(Model, MSE, RMSE, MAE) %>%
mutate(
MSE = round(MSE, 4),
RMSE = round(RMSE, 4),
MAE = round(MAE, 4)
)
kable(comparison_table,
caption = "Cross-Validation Prediction Accuracy Metrics",
col.names = c("Model", "MSE", "RMSE", "MAE"),
align = c('l', 'c', 'c', 'c')) %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE) %>%
row_spec(which.min(comparison_table$MSE), background = "#e6f3ff") %>%
row_spec(which.min(comparison_table$RMSE), background = "#e6f3ff") %>%
row_spec(which.min(comparison_table$MAE), background = "#e6f3ff")

| Model | MSE | RMSE | MAE |
|---|---|---|---|
| 2-Regime MS-GJR-GARCH | 0.4601 | 0.6783 | 0.5026 |
| 1-Regime GJR-GARCH | 0.4653 | 0.6821 | 0.5039 |
The out-of-sample performance metrics confirm that the 2-Regime MS-GJR-GARCH model outperforms the benchmark 1-Regime GJR-GARCH, achieving lower error rates across all evaluated categories (MSE, RMSE, and MAE).
Specifically, the regime-switching specification reduced the Mean Squared Error (MSE) by roughly 1.1% (from 0.4653 to 0.4601) when benchmarked against realized absolute returns. While the numerical improvement is marginal, the consistency of the reduction across all metrics suggests that the inclusion of regime dynamics provides a tangible predictive edge, particularly in capturing the speed of mean reversion.
Furthermore, the primary advantage of the MS-GJR-GARCH framework extends beyond raw error minimization: it offers the distinct capability to disentangle structural volatility (Regime 2) from transitory shocks (Regime 1), a feature that the single-regime benchmark statistically cannot provide.